Abstract

Extracting moving objects from a video sequence and estimating the background 
of each individual image are fundamental issues in many practical applications 
such as visual surveillance, intelligent vehicle navigation, and traffic monitoring. 
Recently, some methods have been proposed to detect moving objects in a video 
via low-rank approximation and sparse outliers where the background is modeled 
with the computed low-rank component of the video and the foreground objects 
are detected as the sparse outliers in the low-rank approximation. Many of these 
existing methods work in a batch manner, preventing them from being applied 
in real time and long duration tasks. To address this issue, some online methods 
have been proposed; however, existing online methods fail to provide satisfactory 
results under challenging conditions such as dynamic background scene and 
noisy environments. In this paper, we present an online sequential framework, 
namely contiguous outliers representation via online low-rank approximation 
(COROLA), to detect moving objects and learn the background model at the 
same time. We also show that our model can detect moving objects with a 
moving camera. Our experimental evaluation uses simulated data and real 
public datasets to demonstrate the superior performance of COROLA to the 
existing batch and online methods in terms of both accuracy and efficiency. 
Keywords: Moving Object Detection, Online Low Rank Approximation, 
Markov Random Fields, Online Background modeling 
1. Introduction


Moving object detection and background estimation are fundamental in various
applications of computer vision and robotics such as visual surveillance [1],
traffic monitoring [2], vehicle tracking and navigation [5], and avian
protection [3]. Many methods have been proposed to extract objects from a
sequence of images with a stationary camera [?, ?] or with a moving camera
[?, ?, ?]. These methods can be grouped into several categories. Motion-based
methods [?, ?] use motion information of the image pixels to separate the
foreground from the background. These methods work based on the assumption
that foreground objects move differently from the background. Therefore it is
possible for these methods to classify pixels according to their movement
characteristics even in the case of significant camera motion. However, these
methods require point tracking to identify the foreground, which can be
difficult especially with large camera motion [12]. In addition, they are
limited in terms of dealing with dynamic background or noisy data [?].

Another popular category for moving object detection methods is background
subtraction [?], which compares the pixels of an image with a background
model and considers those that differ from the background model as moving
objects. Thus, building a background model plays a critical role in
background subtraction methods. Conventional algorithms for background
modelling include single Gaussian distribution [?], Gaussian mixture model
[16], and kernel density estimation [?]. These methods model the background
for each pixel independently and so they are not robust against global
variations such as illumination changes.

Recently a new approach to background modelling, namely low-rank matrix
approximation, has been developed [?]. Methods in this approach follow
the basic idea from [?]. Oliver et al. [50] proposed Eigenbackground
subtraction using PCA [21] (principal component analysis) to model the
background and detect moving objects. It is based on the observation that the
underlying background images should be unchanged and the composed matrix of
vectorized background images can be naturally modeled as a low-rank matrix.
Extending this idea, current methods exploit the fact that the background
model in an image sequence can be defined by those pixels that are temporally
linearly correlated [22]. By capturing the correlation between images one can
naturally handle global variations. Algebraically speaking, if an image is
vectorized in a column and all images are concatenated into a 2D matrix, then
the columns are dependent and its low-rank approximation matrix represents the
background model of the images. As a result, the background modeling problem
is converted to the low-rank approximation problem. In general, by decomposing
an input matrix of vectorized images into a low-rank matrix and a sparse
matrix, the low-rank and sparse matrices correspond to the background model
and the foreground objects in the image sequence respectively. Our COROLA
algorithm described in this paper adopts the low-rank approximation approach.
We will detail representative algorithms in this approach in Section 2.
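
The decomposition described above can be illustrated with a minimal batch sketch. It assumes a plain truncated SVD as the low-rank approximation, which is a simplification and not the COROLA algorithm itself; the function name `lowrank_background` and the `rank` parameter are illustrative choices, not names from the paper.

```python
import numpy as np

def lowrank_background(frames, rank=1):
    """Split stacked frames into a low-rank part L and a residual S = D - L."""
    # Vectorize each image into a column and concatenate into an m x n matrix D.
    D = np.stack([np.asarray(f, dtype=float).ravel() for f in frames], axis=1)
    # A truncated SVD gives the best rank-`rank` approximation in Frobenius norm.
    U, sing, Vt = np.linalg.svd(D, full_matrices=False)
    L = (U[:, :rank] * sing[:rank]) @ Vt[:rank]
    # The residual is dense here; robust methods additionally force it to be sparse.
    return L, D - L
```

In this sketch, each column of `L` is a background estimate for the corresponding frame, and large entries of the residual are candidate foreground pixels. The robust formulations reviewed in Section 2 replace the plain SVD so that the residual is explicitly sparse.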

Most of the existing background subtraction algorithms based on low-rank
approximation operate in a batch manner; i.e., all images whose background
model is to be constructed are first collected and then used to build a data
matrix whose low-rank approximation is computed. This unfortunately limits the
application of the low-rank approximation approach in terms of its efficiency
and accuracy. Although existing online methods via low-rank approximation have
addressed the efficiency issue to some extent, they are not robust against
dynamic and noisy backgrounds. In this paper, we offer an algorithm, COROLA,
that performs low-rank approximation in a sequential manner so that its
computational complexity does not grow with the number of images in the
sequence. In addition, through image registration, our algorithm is able to
handle the case of a moving camera due to the adaptive nature of the
background model that is being learned. The main contributions of this paper
are as follows.

1. We propose an online formulation of the low-rank approximation algorithm
for foreground object detection. The proposed formulation enables online
application without requiring an entire image sequence, as the batch
formulation does, and is more robust than existing online methods for dynamic
background scenes or noisy environments.

2. COROLA uses a fixed window of images to perform low-rank approximation
and so it is appropriate for continuous operation, which cannot be achieved
by the batch formulation due to matrix decomposition and memory storage.

3. In the case of significant camera motion, a batch formulation has the
limitation that the first and the last images of a sequence must be similar to
find the low-rank matrix. However, in the case of a moving camera, there is in
general no similarity between the first and the last images in a sequence. Our
proposed COROLA algorithm does not require a stationary background.

The remainder of the paper is organized as follows. Related works on
foreground detection via low-rank and sparse decomposition are summarized in
Section 2. Section 3 explains the details of COROLA for foreground detection
and background estimation, followed by the introduction of our online
formulation via greedy bilateral sketch [?]. Experimental results and
discussion are presented in Section 4 and concluding remarks in Section 5.

2. Foreground Detection via Low Rank and Sparse Decomposition 

In recent years, many algorithms have been developed for foreground detection
based on low-rank matrix approximation with robust principal component
analysis (RPCA) [15]. RPCA decomposes a given matrix $D$ into a low-rank
matrix $L$ and a sparse matrix $S$ called outliers. Different techniques exist
for low-rank approximation including principal component pursuit (PCP) [22],
augmented Lagrangian multiplier (ALM) [24], linearized alternating direction
method with an adaptive penalty (LADMAP) [?], and singular value thresholding
(SVT) [25]. All of these techniques need all the data in order to perform
batch optimization that computes the low-rank matrix and the sparse outliers.
Batch processing causes two problems: memory storage and time complexity.
In continuous monitoring tasks or video processing, if matrix $D$ is built
with a large number of images, memory storage will be a problem [27]. In
addition, as the size of the input matrix $D$ increases, the time complexity
of the matrix decomposition also increases.

To address the problem of time complexity, some efficient algorithms have
been proposed [25], [?], [?]. Rodriguez and Wohlberg proposed a fast PCP [25]
algorithm to reduce the computation time of SVD in inexact ALM (IALM).
The "Go Decomposition" (GoDec) method, proposed by Zhou et al., computes
RPCA using bilateral random projections (BRP) [29]. Semi-Soft GoDec
(SSGoDec) and Greedy SSGoDec methods [25] are extensions of GoDec that speed
it up. Although these algorithms reduce the computation time of low-rank
approximation, they still are not satisfactory for applications such as visual
surveillance and robot navigation due to their batch formulation. In many
applications, online processing is critical and batch methods are infeasible.
One of the best known batch processing algorithms is the "detecting contiguous
outliers in the low-rank representation" (DECOLOR) method [30]. This method
uses the prior knowledge that foreground objects should be connected
components of relatively small size. Using this constraint, DECOLOR provides
promising results; however, due to batch processing, it still suffers from
memory storage and time complexity problems. Furthermore, in the case of a
moving camera, the current image is no longer similar to the first images in
matrix $D$, and therefore DECOLOR is not able to detect the foreground
appropriately. In general, batch processing methods cannot operate on a
continuous basis and cannot deal with a moving camera. Although DECOLOR has
introduced an implementation for a moving camera, it only works for short
video sequences with small camera motion.

To overcome the limitations of batch processing methods, incremental and
online robust PCA methods have been developed. He et al. [?] proposed the
Grassmannian robust adaptive subspace tracking algorithm (GRASTA), which is an
incremental gradient descent algorithm on the Grassmannian manifold for
solving the robust PCA problem. This method incorporates the augmented
Lagrangian of the $\ell_1$-norm loss function into the Grassmannian
optimization framework to alleviate the corruption by outliers in the subspace
update at each gradient step. Following the idea of GRASTA, He et al. [32]
proposed transformed GRASTA (t-GRASTA), which iteratively performs incremental
gradient descent constrained to the Grassmann manifold in order to
simultaneously decompose a sequence of images into three parts: a low-rank
subspace, foreground objects, and a transformation such as rotation or
translation of the image. This method can be regarded as an extension of
GRASTA and RASL [33] (Robust Alignment by Sparse and Low-Rank decomposition)
that computes the transformation and solves the decomposition within an
incremental gradient optimization framework. To improve the accuracy of online
subspace updates especially for dynamic backgrounds, Xu et al. [34] developed
an online Grassmannian subspace update algorithm with structured sparsity
(GOSUS) via an alternating direction method of multipliers (ADMM).

To deal with noisy conditions and dynamic background scenes, Wang et
al. [35] proposed a probabilistic approach to robust matrix factorization
(PRMF) and its online extension for sequential data to obtain improved
scalability. This model is based on the empirical Bayes approach and can
estimate a better background model than GRASTA. Recently, Feng et al. [36]
proposed an online robust principal component analysis via stochastic
optimization (OR-PCA). This method does not need to remember all the past
samples and uses one sample at a time through stochastic optimization. OR-PCA
reformulates a nuclear norm objective function by decomposing it into an
explicit product of two low-rank matrices, which can be solved by a stochastic
optimization algorithm. Javed et al. [37] used this technique for online
foreground detection. Their method first extracts outliers from each image
using OR-PCA and then uses a Markov Random Field (MRF) to improve the quality
of foreground segmentation. However, they did not solve the problem of
foreground detection within a single unified optimization framework; i.e., the
MRF is applied only once to improve the outliers of OR-PCA, without
alternating learning to update the OR-PCA. As a result, the reported
performance is not competitive with respect to those in the literature.
2.1. Relation of our method to other methods

Since our COROLA method uses the sparsity and connectedness terms of the
DECOLOR method and estimates the background model using sequential low-rank
approximation with the help of OR-PCA, we present a summary of these two
methods, and in the next section we describe our COROLA method, which extends
them.


2.1.1. DECOLOR 


DECOLOR is a formulation that integrates the outlier support and the
estimated low-rank matrix in a single optimization problem, for joint object
detection and background learning. Specifically, it works by solving the
following minimization:

\[
\min_{L,S} \; \frac{1}{2}\|\mathcal{P}_{S^{\perp}}(D - L)\|_F^2 + \beta_2\|S\|_1 + \gamma\|\Phi(S)\|_1
\quad \text{s.t.} \quad \operatorname{rank}(L) \le r,
\tag{1}
\]

where $D$, $L$, and $S$ are the matrix of vectorized images, estimated
background images, and outlier support, respectively. $S$ in (1) is binary and
its elements are 1 for outliers. $S^{\perp}$ is the complement of $S$ and its
elements are 1 for background pixels of the images. $\Phi(S)$ means the
difference between neighboring pixels and therefore the last term of the above
minimization encourages connectedness of outliers. Zhou et al. [30] solved the
first term of (1) with its constraint using an alternating algorithm
(SOFT-IMPUTE) [38]. They then solved the rest of the minimization problem by
Markov Random Field (MRF) [39]. This two-step optimization is iterated until
convergence. Although this method provides promising results, it still suffers
from memory storage and time complexity problems in large datasets and, due to
batch processing, it is not appropriate to operate on a continuous basis.
Furthermore, in the case of a moving camera, DECOLOR only works for short
video sequences with small camera motion and cannot deal with a moving camera
in general.
2.1.2. OR-PCA

OR-PCA solves stochastic optimization sequentially, processing one sample
at a time and producing a solution that is equivalent to that of the batch
RPCA. As a result, its computation cost is independent of the number of
samples. OR-PCA solves the following minimization problem:

\[
\min_{U,V,E} \; \frac{1}{2}\|D - UV - E\|_F^2 + \frac{\lambda_1}{2}\left(\|U\|_F^2 + \|V\|_F^2\right) + \lambda_2\|E\|_1
\tag{2}
\]

where $U$, $V$, and $E$ are the basis, coefficient, and sparse error matrices.
Feng et al. [36] solved (2) in an online manner, one sample at a time, with
two iterative updating parts. First, the coefficients and the sparse error for
each new sample are updated using the previous basis. Then, the basis is
updated using the new sample, updated coefficients, and sparse errors.

In this paper, extending the work of DECOLOR and OR-PCA, we introduce a novel
non-convex closed-form formulation for detection of moving objects named
COROLA. It solves the challenges of memory storage and time complexity of [30]
and provides more accurate results than [36], especially in noisy
environments. COROLA is also able to extract moving objects using a moving
camera on a continuous basis, which cannot be achieved in general by a batch
processing method especially in the case of large camera motion.

3. Online Moving Object Detection by COROLA 

In this section, we focus on online detection of moving objects for both static
and moving cameras. We first formulate the problem of background modelling
and foreground object detection and then describe in detail our COROLA
algorithm, which computes the low-rank approximation and foreground detection
sequentially.

3.1. Notations and Formulation 

Let $X \in \mathbb{R}^m$ be a vectorized image and $X_j$ be the $j$th image in
a sequence, expressed as a column vector of $m$ pixels. Then,
$D = [X_1, \dots, X_n] \in \mathbb{R}^{m \times n}$ is a matrix of $n$ images
and the $i$th pixel in the $j$th image is denoted as $X_{ij}$. To indicate
foreground for an observed image $j$, we use a binary indicator vector
$s = [s_1, s_2, \dots, s_m]^T$ as the foreground support where

\[
s_i =
\begin{cases}
0 & \text{if } i \text{ is background} \\
1 & \text{if } i \text{ is foreground}
\end{cases}
\tag{3}
\]

and matrix $S = [S_1, S_2, \dots, S_n]$ is the binary matrix of supports for
all images in $D$. Also, we use the function
$\mathcal{P}_s(X) \in \mathbb{R}^{\|s\|_0}$ to construct a vector of at most
$m$ foreground pixels of image $X$. Note that the $\ell_0$-norm $\|s\|_0$ is
the cardinality of $s$, i.e. the number of non-zero elements in $s$. For a
matrix with more than one column, $\mathcal{P}_{s,:}$ constructs multiple
columns, each by applying $\mathcal{P}_s$ to a column of the input matrix.
Now, let $L = UV$. The objective function in (1) can be rewritten as follows.

\[
\min_{U,V,S} \; \frac{1}{2}\|\mathcal{P}_{S^{\perp}}(D - UV)\|_F^2 + \beta_2\|S\|_1 + \gamma\|\Phi(S)\|_1
\quad \text{s.t.} \quad \operatorname{rank}(U) = \operatorname{rank}(V) \le r,
\tag{4}
\]

With the above notations and equations, and by relaxing the constraints
of (4) based on [?], the problem of background modelling and foreground object
detection via sequential low-rank approximation and contiguous outlier
representation solves the following optimization problem for each observed
image.

\[
\min_{U,v,s} \; \frac{1}{2}\|\mathcal{P}_{s}(X - Uv)\|_F^2 + \beta_1\|\mathcal{P}_{s,:}(U)\|_F^2 + \beta_1\|v\|_2^2 + \beta_2\|s\|_1 + \gamma\|\Phi(s)\|_1
\tag{5}
\]

where $X \in \mathbb{R}^m$ is an observed image, $r$ is the upper bound on the
rank of the basis matrix $U \in \mathbb{R}^{m \times r}$, and
$v \in \mathbb{R}^r$ is a coefficient vector. $\Phi(s)$ means the difference
between neighboring pixels and it is computed by
$\|\Phi(s)\|_1 = \sum_{(i,k) \in \mathcal{E}} |s_i - s_k|$, where
$\mathcal{E}$ is the neighborhood clique. Note that the objective function
defined in (5) is non-convex and involves both continuous and discrete
variables. Since (5) is our online formulation for each input image, the loss
over all data is the cumulative loss over the images. The first three terms
try to compute the low-rank representation of input image $X$ by first
expressing it as a linear combination of the background basis $U$ and its
coefficient vector $v$, and then penalizing only the foreground pixels using
the extraction function $\mathcal{P}_s$. The last two terms of (5) find
contiguous and small outliers to represent the foreground mask. Specifically,
the fourth term imposes a sparsity constraint on the foreground mask $s$;
i.e., the foreground pixels should be low in number. The last term imposes a
connectivity constraint on mask $s$ to account for correlation between
neighboring pixels of an image. By minimizing (5) we can estimate the best
low-rank representation of an input image and detect foreground objects
concurrently. However, solving this joint optimization in one step is
difficult. Therefore, a two-step alternating optimization procedure is
commonly used: a low-rank approximation step involving $U$ and $v$, and a
contiguous sparse optimization step involving $s$, performed alternately to
obtain background estimation and foreground detection. In the first step, (5)
is treated as a minimization over $U$ and $v$, for which we introduce an
online approach via the greedy semi-soft GoDec (Gre-SSGoDec) and OR-PCA
methods rather than the SOFT-IMPUTE algorithm [38] used in batch methods. In
the second step, minimization over $s$ is conducted. In addition, we use the
combination of a Gaussian Mixture Model (GMM) and a first order MRF with
binary labels in the second step to improve the foreground detection
performance.
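
As an illustration of the last two terms of (5), the sketch below evaluates the sparsity and connectivity penalties for a binary mask. It assumes the mask is stored as a 2D array and that the neighborhood is the 4-connected pixel grid; the function name `mask_penalty` is illustrative and does not appear in the paper.

```python
import numpy as np

def mask_penalty(mask, beta2, gamma):
    """Return beta2 * ||s||_1 + gamma * ||Phi(s)||_1 for a 2D binary mask."""
    s = np.asarray(mask, dtype=float)
    # Sparsity term: number of pixels labelled as foreground.
    sparsity = s.sum()
    # Connectivity term: sum of |s_i - s_k| over vertical and horizontal
    # neighbour pairs (assumed 4-connected grid).
    connectivity = np.abs(np.diff(s, axis=0)).sum() + np.abs(np.diff(s, axis=1)).sum()
    return beta2 * sparsity + gamma * connectivity
```

Under these assumptions, a compact blob and a scattered set of the same number of pixels share the same sparsity cost, but the scattered set has more label changes between neighbours and so a larger connectivity cost, which is why the last term favours contiguous outliers.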

3.2. Online Low-Rank Approximation 

For solving the first step of (5), we describe in this section our sequential
method to compute the low rank background model of an image sequence and
the foreground as its sparse outliers, in a way that is suitable for continuous
and real time operation. In our sequential formulation, we adopt an online
updating approach for optimization over $U$ and $v$. Therefore (5) can be
rewritten as:

\[
\min_{U,v} \; \frac{1}{2}\|\mathcal{P}_{s}(X - Uv)\|_F^2 + \beta_1\|\mathcal{P}_{s,:}(U)\|_F^2 + \beta_1\|v\|_2^2
\tag{6}
\]

Since (6) updates the subspace of $U$ based on the foreground mask $s$, we
rewrite the objective function for the rest of this section as follows.

\[
\min_{\tilde{U},v} \; \frac{1}{2}\|\tilde{X} - \tilde{U}v\|_2^2 + \beta_1\|\tilde{U}\|_F^2 + \beta_1\|v\|_2^2
\tag{7}
\]

where $\tilde{U} = \mathcal{P}_{s,:}(U)$ and $\tilde{X} = \mathcal{P}_s(X)$.

Initialization step: With a small number of images at the beginning of a
sequence, no fewer than the rank of the background model, we initialize $U$
with a batch method. This enables us to estimate the rank $r$ roughly for the
images in the rest of the sequence. Since this step is performed only once,
the complexity of using a batch formulation is not an issue. After the
initialization of $U$, for each input sample $X$, we use an incremental
approach to solve (7) in the following two parts, applied repeatedly. These
two parts update $v$, and then $U$ (by updating the subspace of $U$), for each
sample to build the background model incrementally as follows:


Part 1: Because every two consecutive images in a sequence are similar, we
can update the coefficient vector $v$ (or $U$) for the current image via the
background model $U$ (or $v$) computed for the previous image. To update $v$
with $U$ fixed, (7) becomes:

\[
v = \arg\min_{v} \; \frac{1}{2}\|\tilde{X} - \tilde{U}v\|_2^2 + \beta_1\|v\|_2^2
\tag{8}
\]

where $X \in \mathbb{R}^m$ is the current image and
$\tilde{X} = \mathcal{P}_s(X)$. By fixing $\tilde{U}$, (8) is a least squares
problem and can be solved by

\[
v = (\tilde{U}^T\tilde{U})^{\dagger}\tilde{U}^T\tilde{X}
\tag{9}
\]

where $(\cdot)^{\dagger}$ is the Moore-Penrose pseudoinverse [23].
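
A minimal sketch of the coefficient update in (9) follows. It assumes the support operator is represented by a boolean row mask, here called `keep`, that selects which pixels enter the fit; `keep` and the function name are illustrative and are not notation from the paper.

```python
import numpy as np

def update_coefficients(U, x, keep):
    """Solve (9): v = pinv(U~^T U~) U~^T x~ on the rows selected by `keep`."""
    U_t = U[keep]  # rows of the basis selected by the support (U~)
    x_t = x[keep]  # matching pixels of the current image (x~)
    return np.linalg.pinv(U_t.T @ U_t) @ U_t.T @ x_t
```

Because only the selected rows are used, pixels excluded by the mask cannot bias the coefficients, so corrupted pixels outside `keep` leave the estimate unchanged.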
Part 2: To update $U$, (7) can be rewritten as:

\[
\min_{\tilde{U}} \; \frac{1}{2}\|\tilde{X} - \tilde{U}v\|_F^2 + \beta_1\|\tilde{U}\|_F^2
\tag{10}
\]

and, according to Frobenius norm properties, (10) can be solved by:

\[
\tilde{U} = \arg\min_{\tilde{U}} \; \frac{1}{2}\operatorname{Tr}\left[\tilde{U}(A + \beta_1 I)\tilde{U}^T\right] - \operatorname{Tr}(\tilde{U}^T\tilde{B})
\tag{11}
\]

where $A = vv^T$ and $\tilde{B} = \tilde{X}v^T$. $\tilde{U}$ means we update
$U$ for those pixels that have foreground mask $s_i = 1$. Since $U$ is the
basis of background for all images, it cannot be computed independently. This
constraint of updating for $A$ and $\tilde{B}$ has been dealt with in [36],
where the basis $U$ minimizes a cumulative loss w.r.t. the previously
estimated coefficients $v$. Therefore, we use the following cumulative form to
update $A$ and $\tilde{B}$, before computing $\tilde{U}$ for the first
iteration.

\[
A = A + vv^T, \qquad \tilde{B} = \tilde{B} + \tilde{X}v^T
\tag{12}
\]

These cumulative forms enable us to use the previous background models to
compute the current $U$ and keep the background model more stable against
unexpected changes as the number of images increases through time. In contrast
to [36], we update $\tilde{B}$ only for those pixels that have foreground
support $s_i = 1$. Therefore, the number of rows in $\tilde{B}$ is variable
and equal to $\|s\|_0$ in each iteration. In this part the additive $A$ and
$B$ save all previous information of $U$ and $v$ and are updated for the
current image. As the values of $A$ and $B$ increase, the obtained background
model becomes stable.
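
The accumulation in (12) and the basis update in (11) can be sketched as below. This sketch swaps in a direct linear solve for the minimizer of (11), obtained by setting its gradient to zero, rather than the iterative algorithm the paper cites; the boolean row mask `keep` and the function name are illustrative assumptions.

```python
import numpy as np

def update_basis(U, A, B, x, v, keep, beta1):
    """Accumulate A and B as in (12), then minimize (11) on the selected rows."""
    A = A + np.outer(v, v)
    B = B.copy()
    B[keep] += np.outer(x[keep], v)  # only the selected rows of B change
    U = U.copy()
    # Zero gradient of (11): U~ (A + beta1 I) = B~. The matrix A + beta1 I is
    # symmetric, so the transposed system can be solved directly.
    M = A + beta1 * np.eye(len(v))
    U[keep] = np.linalg.solve(M, B[keep].T).T
    return U, A, B
```

Rows outside `keep` are left untouched, which matches the statement that only part of the basis is updated for each image. For `beta1 > 0` the matrix `A + beta1 * I` is positive definite, so the solve is well posed.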

For the first iteration, $s_i = 1$ for all pixels of the current image and so
the number of rows in $\tilde{B}$ and $\tilde{U}$ is the same as that in the
input image; subsequently, the number of rows in $\tilde{B}$ and $\tilde{U}$
decreases in succeeding iterations as the foreground area is decreased. (11)
can be solved with a simple iterative algorithm presented in [35]. Since
COROLA is an iterative algorithm based on (5) and the size of $\tilde{U}$ and
$\tilde{B}$ changes in each iteration, in this implementation we save the
values of ${}^{1}U$, ${}^{1}v$, ${}^{1}A$, and ${}^{1}B$ after the first
iteration. We use these values in the first iteration of the next input image.
Also these variables have the most information for building the background
model of the current image, which is computed by $L = {}^{1}U\,{}^{1}v$.
However, foreground detection depends on the mask $s$ obtained from the second
step of solving (5), and the algorithm continues to iterate until the
convergence criteria are met. This is because, for dynamic backgrounds,
outliers are a combination of the foreground object and moving parts of the
background that act as noise (e.g., waving trees). These moving parts do not
affect the background model, but they create false positives in the foreground
mask $s$. We will explain the convergence criteria after solving the second
step of (5).


3.3. Online Foreground Detection 

Let the current $X$ and its corresponding $L$ be $X_j$ and $L_j$,
respectively. Also, $s_j$ is the indicator vector $s$ for the $j$th image. Now
we investigate how to compute the foreground mask $s$ given the residual
$E_j = X_j - L_j$ ($L_j$ is computed in background modeling in the previous
section for the $j$th observed image). The goal now is to find the indicator
vector $s_j$ on $E_j$. Assuming that the foreground objects are relatively
small connected components, we can model the foreground mask $s_j$ by a Markov
Random Field (MRF) [39]. Specifically, let graph
$\mathcal{G} = (\mathcal{V}, \mathcal{E})$ where $\mathcal{V}$ is the set of
vertices that correspond to the pixels of an image and $\mathcal{E}$ is the
set of edges that connect neighboring pixels. Then, by defining an energy
function (13) of $s_j$ as a sum of terms over the vertices
$i \in \mathcal{V}$ and a sum of terms over the edges
$(i,k) \in \mathcal{E}$, which is called the "Ising model" in the literature
and is an example of an MRF, we can derive the foreground mask $s_j$. The
first and the second terms impose sparsity and continuity on $s_j$, in a way
that is similar to the last two terms of (5), which shows that $s_j$ can be
modeled using an MRF [?]. However, extracting foreground objects from $E$,
which is a combination of outliers and noise, would not be accurate,
especially in noisy environments such as dynamic backgrounds or with a moving
camera. In most cases we need to separate reliable outliers representing true
foreground from noise in estimating the foreground support $s_j$. In most
applications, noise comes from a complicated and dynamic background such as
waving trees or sea waves, which should be classified as background.

Here, we describe outliers with a Gaussian model $\mathcal{N}(\mu, \sigma^2)$.
Using this model of the outliers enables us to control the complexity of the
background variations and also recognize true outliers in the presence of
noise using (14). In our study, an adaptive Gaussian Mixture Model (GMM) [?]
is used for each component of $E$ to separate the outliers from noise. As in
most cases, three Gaussian components are sufficient in modeling $E$ to
separate foreground $F$ from noise [40]. Fig. 1 shows the effect of using GMM
on $E$ for dynamic backgrounds. The middle figure shows the obtained residual
$E$. After obtaining $E$, we normalize it and extract outliers $F$ from noise
using the Gaussian model (right figure). So, to solve the second step of (5),
we construct $\hat{E}$ with a simple update rule as follows:

\[
\hat{E}_j = \alpha E_j + (1 - \alpha)F_j
\tag{14}
\]

where $E_j = X_j - L_j$ and $F_j$ is the outliers obtained using GMM on the
current image ($j$th image of the sequence). $\alpha \in [0,1]$ is a constant
that controls the magnitude of noise so that a small $\alpha$ would be used
for noisy data (e.g., for moving cameras). In all of our experiments
$\alpha = 0.1$.

Figure 1: The effects of using GMM on outliers obtained from low rank
approximation on noisy and dynamic background. The left figure shows an input
image, and the middle and right figures show the obtained outliers $E$ and
$\hat{E}$, respectively.
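
The update rule (14) is a convex combination of the raw residual and the GMM-filtered outliers. A minimal sketch, assuming `E` and `F` are already available as arrays of the same shape (the GMM step that produces `F` is not shown, and the function name is illustrative):

```python
import numpy as np

def blend_residual(E, F, alpha=0.1):
    """Return alpha * E + (1 - alpha) * F as in (14)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * np.asarray(E, dtype=float) + (1.0 - alpha) * np.asarray(F, dtype=float)
```

With a small `alpha`, a pixel whose residual is large but which the GMM treats as noise (zero in `F`) keeps only a fraction `alpha` of its residual, while a pixel that survives in `F` is left essentially unchanged.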

Now we can solve the second step of our optimization problem that extracts
moving objects from outliers, and (5) can be rewritten as the following
objective function to minimize the energy over $s_j$ via the obtained outliers
$\hat{E}$.

\[
\begin{aligned}
& \min_{s_j} \; \frac{1}{2}\|\mathcal{P}_{s}(\hat{E})\|_F^2 + \beta_2\|s_j\|_1 + \gamma\|\Phi(s_j)\|_1 + C \\
& \quad = \frac{1}{2}\sum_{i : s_i = 1} \hat{E}_i^2 + \beta_2\sum_{i} s_i + \gamma\|\Phi(s_j)\|_1 + C
\end{aligned}
\tag{15}
\]

where $C$ is a constant. The first term of (15) is constant and therefore (15)
is the first order MRF with binary labels (the same as (13)), which can be
solved using graph-cut [?], [32]. The result of (15) is the binary mask $s_j$,
which indicates the foreground pixels of $X_j$. So far, the first iteration of
(5) is completed and, based on mask $s_j$, the next iteration starts from (8).
In our experiments, COROLA converges in approximately $r$ iterations where $r$
is the rank of data in the sequence. Our convergence criterion is similar to
[30] and we use
$(\text{energy}_{\text{prev}} - \text{energy})/\text{energy} < 10^{-4}$, where
$\text{energy} = \frac{1}{2}\|\mathcal{P}_{s}(X_j - Uv)\|_F^2 + \beta_2\|s_j\|_1$.
In this formulation, the first and the second terms show the error of the
background model and the foreground object size, respectively. The algorithm
is considered to have converged if the error of the background model and the
size of the foreground object stabilize. In Algorithm 1 we summarize all steps
of COROLA.
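
The stopping rule above can be sketched as follows. The sketch assumes the residual passed in has already been restricted to the pixels selected by the support operator; the function names and the guard for a non-positive energy are illustrative additions, not part of the paper.

```python
import numpy as np

def energy(residual, s, beta2):
    """0.5 * ||residual||^2 + beta2 * ||s||_1 for a binary mask s."""
    return 0.5 * float(np.sum(np.square(residual))) + beta2 * float(np.sum(s))

def has_converged(energy_prev, energy_now, tol=1e-4):
    """Relative-decrease test (energy_prev - energy) / energy < tol."""
    if energy_now <= 0.0:
        return True  # assumed guard: nothing left to decrease
    return (energy_prev - energy_now) / energy_now < tol
```

In an alternating loop, `energy` would be evaluated after each mask update and compared with the value from the previous iteration; the loop in Algorithm 1 additionally stops once the iteration count exceeds $r$.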

3.4. Convergence of COROLA

In this section, we explain the convergence criteria of COROLA. In general,
our main objective function (5) is non-convex and we solve it by alternating
between two steps. In step one for low-rank approximation, we always minimize
a single lower-bounded energy function using OR-PCA. The convergence property
of OR-PCA has been proved in [?]. In the second step for outlier detection, we
use MRF and its convergence has been discussed in [41]. Using these two steps,
the algorithm must converge to a local minimum; furthermore, [30] showed that
this combinatorial optimization decreases the energy monotonically through
iterations and can converge to acceptable results in background modelling and
moving object applications.

3.5. Online Moving Object Detection with a Moving Camera 

In this part, we extend our moving object detection method to the case of a
moving camera. As we mentioned in Section 1, due to the dissimilarity between
the first and the last images in a sequence, a batch method is not able to deal
with continuous processing using a moving camera. However, in online methods
the background model evolves with time and similarity between the first and
the current image is not required. In our method, we build the background
model for the current image and, based on a transformation function between
the current and the new image, the model is transformed to be matched with the
new image. Then we can update it for the new image to detect the foreground
objects. Note that the background model is transformed through time. So the
key in foreground detection using a moving camera is the transformation of the
low-rank structure to the new input image.

Algorithm 1 Online Moving Object Detection by COROLA
Initialize: GMM parameters, $\beta_1$, $\beta_2$, $\gamma$, $\alpha$, $r$, $A$, and $B$
for $j = 1 : n$
    Input data: $X_j$
    $t = 1$ and $s_i = 1$, $i = \{1, \dots, m\}$
    repeat
        if $t = 1$
            $A \leftarrow A_{j-1}$, $B \leftarrow B_{j-1}$, $U \leftarrow U_{j-1}$
        else
            $A \leftarrow A_j$, $B \leftarrow B_j$, $U \leftarrow U_j$
        end if
        $v_j \leftarrow \arg\min_{v_j} \frac{1}{2}\|\tilde{X}_j - \tilde{U}v_j\|_2^2 + \beta_1\|v_j\|_2^2$, where $\tilde{X}_j = \mathcal{P}_s(X_j)$, $\tilde{U} = \mathcal{P}_{s,:}(U)$
        $A_j \leftarrow A + v_j v_j^T$, $\tilde{B}_j \leftarrow \tilde{B} + \tilde{X}_j v_j^T$
        $\tilde{U}_j \leftarrow \arg\min_{\tilde{U}} \frac{1}{2}\operatorname{Tr}[\tilde{U}(A_j + \beta_1 I)\tilde{U}^T] - \operatorname{Tr}(\tilde{U}^T\tilde{B}_j)$
        $E_j \leftarrow X_j - L_j$, compute $F_j$ over $E_j$ from [40]
        $\hat{E}_j \leftarrow \alpha E_j + (1 - \alpha)F_j$
        Compute cost of assigning labels using $\hat{E}_j$ to optimize $s$
        $s_j \leftarrow \arg\min_{s} \beta_2\sum_i s_i + \gamma\|\Phi(s_j)\|_1$
        if $t = 1$
            ${}^{1}U_j \leftarrow U_j$, ${}^{1}v_j \leftarrow v_j$, ${}^{1}A_j \leftarrow A_j$, and ${}^{1}B_j \leftarrow B_j$
        end if
        if $t > r$
            break
        else
            $t \leftarrow t + 1$
        end if
    until convergence
    Output: $s_j$, $L_j = {}^{1}U_j\,{}^{1}v_j$
    $U_j \leftarrow {}^{1}U_j$, $v_j \leftarrow {}^{1}v_j$, $A_j \leftarrow {}^{1}A_j$, and $B_j \leftarrow {}^{1}B_j$
end for

Let $\tau_j$ be a transformation that maps $X_{j-1}$ to $X_j$. This transformation is
obtained from an affine transformation estimated from the two 2D images. We
also assume $X_{j-1} = U_{j-1}V_{j-1}$ and that there are no changes between the two
images except for the affine transformation, so that $X_j = \tau_j \circ X_{j-1}$. For the
sake of brevity, we state without proof that the following equation allows us to
reconstruct the current view $X_j$ from the background model and the registration
transform $\tau_j$.

$$X_j = \tau_j \circ X_{j-1} = (\tau_j \circ U_{j-1}) V_{j-1} \tag{16}$$

From (16), the transformation only changes $U$. In fact, we need to transform $B$
via $\tau$ only once, for the first iteration of each input image, where
$\tilde{U}_{j-1} = \tau \circ U_{j-1}$ and $\tilde{B}_{j-1} = \tau \circ B_{j-1}$. In (12), $A$ remains unchanged because
$V$ is independent of $\tau$. Based on the above assumptions and (16), $U_j = \tilde{U}_{j-1}$
and $V_j = V_{j-1}$. After the transformation, some elements of $\tilde{U}_{j-1}$ and $\tilde{B}_{j-1}$,
which are related to the pixels on the border of the current image, have no
corresponding pixels, and we have to estimate them using other pixels. To solve
the problem, we first normalize both $\tilde{U}_{j-1}$ and the current image to [0,1]. Then,
using $X_j$ and $V_{j-1}$ (or $V_j$), we estimate the missing pixels of $\tilde{U}_{j-1}$ by replacing
them with the corresponding values obtained from (17), and ensure they lie in
the correct range, as follows.

$$\tilde{U}_{j-1} = X_j V_{j-1}^T (V_{j-1} V_{j-1}^T)^{\dagger} \tag{17}$$

Similarly, for estimating the missing pixels of the transformed $\tilde{B}_{j-1}$, we normalize
both $\tilde{B}_{j-1}$ and $\tilde{U}_{j-1} V_j V_j^T$, replace the missing values of $\tilde{B}_{j-1}$ with those of
$\tilde{U}_{j-1} V_j V_j^T$, and ensure they lie in the correct range.
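The border-filling step for the warped basis can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the warp itself is not shown (rows without a source pixel are simply marked in a boolean mask), and the normalization to [0,1] and the final range check described above are omitted. The estimate is the least-squares form $X V^T (V V^T)^{\dagger}$ for a single frame.

```python
import numpy as np

def estimate_basis(x, v):
    """Least-squares basis estimate x v^T (v v^T)^+ for one frame.

    x: (m,) current frame, v: (r,) coefficient vector.
    """
    return np.outer(x, v) @ np.linalg.pinv(np.outer(v, v))

def fill_missing_rows(U_warped, missing, x, v):
    """Replace rows of the warped basis that have no corresponding pixel.

    U_warped: (m, r) basis after warping, missing: (m,) boolean mask of
    border pixels, x: (m,) current frame, v: (r,) coefficients.
    """
    U_filled = U_warped.copy()
    U_filled[missing] = estimate_basis(x, v)[missing]
    return U_filled

# Usage: mark the first four pixels as having no source after the warp.
rng = np.random.default_rng(1)
m, r = 20, 3
x_cur = rng.random(m)
v_prev = rng.standard_normal(r)
U_w = rng.standard_normal((m, r))
missing = np.zeros(m, dtype=bool)
missing[:4] = True
U_w[missing] = np.nan
U_f = fill_missing_rows(U_w, missing, x_cur, v_prev)
```

By construction the filled rows reproduce the current frame when multiplied by the coefficients, so the border pixels of the background estimate agree with the new image until later updates refine them.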

Based on the experimental results, this approach can estimate the missing pixels
of $U$ and $B$ after transformation. In addition, the GMM for the previous $E_{j-1}$
should be transformed via $\tau$ to match the current $E_j$. After transforming
$U$ and $B$, we can apply the COROLA method for a static camera to build the
background model and detect the foreground objects. Fig. 2 shows a sample image,
its computed background model and extracted moving object via COROLA,
together with the intermediate results.

Figure 2: An example of COROLA for a moving camera. (a) input image from a sequence,
(b) background model, (c) E, (d) E, (e) S, and (f) extracted foreground object using mask S.
Red lines show the processing area.

The complexity of our sequential low-rank approximation by COROLA consists
of contributions from two major parts. The computational complexity of
the first part is $O(mr)$. The complexity of the second part of the low-rank
approximation in our model is $O(r^2 + mr) + O(mr^2)$. Therefore, the overall
complexity of COROLA for the low-rank approximation step is $O(r^2 + mr^2)$.

4. Experimental Results

In this section, we compare COROLA with competing algorithms in the 
literature. We perform two sets of experiments on synthetic data and real 
benchmark datasets and show quantitative and qualitative results. For quan¬ 
titative evaluation where ground truth is available, we use pixel-level precision 
and recall, defined as follows: 

$$\text{precision} = \frac{TP}{TP + FP}, \qquad \text{recall} = \frac{TP}{TP + FN} \tag{18}$$

where TP, FP, TN, and FN are the numbers of true positives, false positives,
true negatives and false negatives, in pixels, respectively. Also, instead of using
precision-recall curves, we use F-measure to show the overall accuracy.

$$\text{F-measure} = 2\,\frac{\text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \tag{19}$$
4.1. Synthetic Data

In this set of experiments, we use synthetic data to control noise and to
show the capability of COROLA. The synthesized images are 30 x 100 pixels
(m = 3000). We use n = 200 images. Zhou et al. [30] used a similar scheme
to investigate the robustness of their method against outliers.

To visualize the results, we show all images in a 2D matrix where each column
shows one image of the sequence. We generate the input data D by adding
a foreground to a background matrix B. For generating the foreground and
background we use the same approach as DECOLOR. The background matrix
B = UV is generated via $U \in \mathbb{R}^{m \times r}$ and $V \in \mathbb{R}^{r \times n}$ with random samples from
a standard normal distribution. A small object is superimposed
on each image in matrix B, and shifts from left to right by one
pixel per image until it reaches the right border of the image. The motion
direction of the object is then reversed, and the process repeats. Fig. 3(b) shows
some selected images. The intensity of this object is independently sampled from
a uniform distribution. Also, we add i.i.d. Gaussian noise $\epsilon$ to D with the
corresponding signal-to-noise ratio defined as

$$\mathrm{SNR} = \sqrt{\frac{\mathrm{var}(B)}{\mathrm{var}(\epsilon)}} \tag{20}$$

Figs. 3(a), (b) and (c) show an example of the generated B, the movement of the
generated foregrounds, and the obtained matrix D.
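The data generation described above can be sketched in NumPy as follows. Several details are assumptions made only for this illustration: the object size (10 x 10), its vertical position, the uniform intensity range, the row-major vectorization of each image, and the reading of the SNR as the square root of var(B) over the noise variance. The function name `make_synthetic` is hypothetical.

```python
import numpy as np

def make_synthetic(height=30, width=100, n=200, rank=5, obj_h=10, obj_w=10,
                   snr=10.0, fg_range=(-1.0, 1.0), seed=0):
    """Return (D, B, mask): noisy input, low-rank background, foreground mask."""
    rng = np.random.default_rng(seed)
    m = height * width
    # Low-rank background B = U V with standard normal factors.
    U = rng.standard_normal((m, rank))
    V = rng.standard_normal((rank, n))
    B = U @ V
    # The object moves one pixel per image and bounces at the image borders.
    mask = np.zeros((m, n), dtype=bool)
    top = (height - obj_h) // 2          # assumed vertical position
    pos, step = 0, 1
    for j in range(n):
        frame = np.zeros((height, width), dtype=bool)
        frame[top:top + obj_h, pos:pos + obj_w] = True
        mask[:, j] = frame.ravel()       # assumed row-major vectorization
        if not 0 <= pos + step <= width - obj_w:
            step = -step                 # reverse direction at a border
        pos += step
    # Superimpose the object with independent uniform intensities.
    D = B.copy()
    D[mask] = rng.uniform(fg_range[0], fg_range[1], size=np.count_nonzero(mask))
    # Add i.i.d. Gaussian noise scaled to the requested SNR.
    noise_std = np.sqrt(B.var()) / snr
    D += rng.normal(0.0, noise_std, size=D.shape)
    return D, B, mask

D, B, mask = make_synthetic()
```

Each column of `D` is one vectorized 30 x 100 image, so the matrix can be displayed directly in the column-per-image layout used for the figures.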

Figure 3: An example of synthetic data. (a) shows matrix $B \in \mathbb{R}^{3000 \times 200}$, with m = 3000,
n = 200, and rank r = 5, where B = UV, $U \in \mathbb{R}^{3000 \times 5}$ and $V \in \mathbb{R}^{5 \times 200}$. (b) shows some
sample images from selected columns of B, with an object superimposed on each of them. The
object is marked by a red box in the first image in (b); the other images show the movement
of the object to the left and right of the image. (c) shows a sample of the generated matrix
D as the input data.

We test the COROLA method and compare it with leading online methods
(GRASTA, OPRMF, and ORPCA) and with DECOLOR, one of the best batch
methods, under different SNRs, different matrix ranks, and different
sizes of the foreground object. One sample of our experiments with different
SNRs between COROLA and all mentioned methods is shown in Fig. 4.
In the first row of Fig. 4, with SNR = 10, COROLA, OPRMF and DECOLOR
have roughly the same results for extracting the foreground object, but
when we increase the noise in the second row (SNR = 1), COROLA works
better than all other methods, including DECOLOR, in extracting the moving
object. This is mainly attributed to using GMM to compute the coefficients of
outliers to separate the foreground object from noise. Tuning the outlier
coefficients via GMM enables us to separate noise and outliers, especially in a
noisy environment, and the result becomes more accurate over time.

Figure 4: Comparison of COROLA, GRASTA, ORPCA, OPRMF and DECOLOR with
different SNRs. The first row and the second row show the results of the methods with
SNR = 10 and SNR = 1, respectively. Columns from left to right: input matrix D, ground
truth, GRASTA, OR-PCA, OPRMF, DECOLOR, COROLA.

To evaluate COROLA in comparison with GRASTA, OPRMF, ORPCA,
and DECOLOR, we tested the effects of some scene parameters such
as SNR, rank of matrix D, and size of the object. The quantitative results of
this comparison in terms of F-measure are provided in Fig. 5. The first column
of Fig. 5 illustrates the effect of noise on all methods when we change the SNR
from 8 to 1 at different ranks. The rows from top to bottom show our
experiments at ranks of 1, 3, and 5. Since one of the advantages of
DECOLOR is its high accuracy in detecting objects of different sizes,
the second column of Fig. 5 shows the accuracy of COROLA in comparison
with DECOLOR in extracting moving objects of different sizes. This result
demonstrates that the capability of our method is comparable with DECOLOR
in terms of average F-measure. Although the result of DECOLOR is
slightly more accurate than COROLA for large objects, as the object size
decreases COROLA generates a better result than DECOLOR, even when we
increase the rank of matrix D from 1 to 5.

To evaluate the rank sensitivity of COROLA, we tested the effect of changing
the rank used by our method against other online methods. Fig. 6
shows the F-measure of these methods when we set their rank
from 1 to 50. The columns from left to right show our experiments at
SNR 2, 4, and 8 on synthetic data with true rank 3. In this experiment,
when we set the rank of the methods lower than the true rank of the data, ORPCA,
OPRMF, and GRASTA failed. This is because when the rank of the data is higher
than the predefined rank, these methods incorrectly consider some background
variations as foreground pixels. In contrast, COROLA can extract
foreground objects even with a rank lower than the true rank, because
GMM allows COROLA to remove false positive pixels. In Fig. 6, as the
predefined rank of the methods increases, GRASTA, OR-PCA, and COROLA are
robust, although COROLA still shows the highest F-measure among all
methods.

Figure 5: First column: the comparison in terms of F-measure between COROLA and other
methods with different signal-to-noise ratios (SNR). Second column: the comparison of
F-measure between COROLA and DECOLOR with different object sizes. The three rows show
three different ranks of 1, 3, and 5, respectively.

4.2. Real Data

In this section, we use real benchmark datasets to conduct quantitative and
qualitative evaluation of COROLA and compare it with DECOLOR, MOG,
SSGoDec, and ORPCA+MRF. The real datasets used are popular in moving
object detection and publicly available¹, and they include the “2014 Change
Detection” [13], “Perception or I2R” [H], and “Wallflower” [35] test image sequences.
Table 1 provides the length and image size of these datasets.

¹ https://sites.google.com/site/backgroundsubtraction/test-sequences

Figure 6: Comparison of F-measure between COROLA and other methods at different ranks.
The true rank is 3, and the columns show the results with different SNRs.

Table 1: Details of all sequences used in our experiments for a stationary camera

Dataset            Sequence              Size x #frames
I2R                Water surface         [160,128] x 48
                   Fountain              [160,128] x 523
                   Curtain               [160,128] x 2964
                   Hall                  [176,144] x 1927
                   Campus                [160,128] x 372
                   Escalator             [160,130] x 824
                   Lobby                 [160,128] x 138
                   ShoppingMall          [320,256] x 433
Change Detection   Canoe                 [320,240] x 1189
                   Fall                  [180,120] x 1500
                   Fountain02            [216,144] x 720
                   Overpass              [320,240] x 3000
Wallflower         Waving trees          [160,120] x 287
                   Bootstrap             [160,120] x 299
                   Camouflage            [160,120] x 251
                   Foreground Aperture   [160,120] x 489
                   TimeOfDay             [160,120] x 1850


4.2.1. Evaluation by accuracy

Figs. 7, 8 and 9 show the qualitative results of COROLA for background
estimation and foreground detection for all sequences of Table 1 from the three
datasets I2R, Change Detection, and Wallflower, respectively. Figs. 7, 8 and 9
also show the role of GMM in separating outliers from noise. These results are
shown in columns (d) and (e) as E, and E, respectively. The results in Figs. 7, 8
and 9 demonstrate the capability of COROLA to detect moving objects and
model the background accurately. The estimated background in the first row
of Fig. 7 has some ghosting because the input image is the 23rd of the sequence
and the parameters have not been learned well enough to build an accurate
background. In general, for short sequences the background model computed
by a batch method such as DECOLOR is more accurate than that of COROLA because
online methods need sufficient samples for training to be stable. However, for
long sequences COROLA can provide results comparable with batch methods.

Figure 7: The results of COROLA on 6 sequences from three datasets Change Detection,
I2R, and Wallflower. Columns (a) and (b) show the original query image and the ground truth
(GT) for the foreground. Columns (c) and (f) show the results of COROLA for estimating
the background L, and the detected foreground objects S, respectively. Columns (d) and (e)
show intermediate results for outliers E, and E, respectively.

Figure 8: The results of COROLA on 4 sequences from the Change Detection dataset. Columns
(a) and (b) show the original query image and the ground truth (GT) for the foreground.
Columns (c) and (f) show the results of COROLA for estimating the background L, and the
detected foreground objects S, respectively. Columns (d) and (e) show intermediate results
for outliers E, and E, respectively.

We also compare COROLA quantitatively with competing online and batch
methods. Table 2 compares COROLA with MOG, GRASTA, OPRMF, and OR-PCA
in terms of F-measure. In most cases COROLA works much better
than all other online methods, specifically in very noisy and dynamic scenes
such as the Fountain, Campus, Canoe, Fall, and Fountain02 sequences. This is
because in these sequences moving parts of the background are often classified as
foreground by other online methods. In contrast, COROLA is able to deal with
the difficult background conditions. By using GMM and (14), the difference between
outliers and the rest of the pixels is boosted, and this allows COROLA to detect
intermittently moving objects better than other competing online methods.

To show the capability of COROLA, we have also included “OR-PCA+MRF”
in our evaluation. Even though this method manually sets all parameters
for each sequence, it does not use an optimization framework and therefore
does not perform as well as COROLA in most of the sequences.

Figure 9: The results of COROLA on 5 sequences from the Wallflower dataset. Columns (a) and
(b) show the original query image and the ground truth (GT) for the foreground. Columns
(c) and (f) show the results of COROLA for estimating the background L, and the detected
foreground objects S, respectively. Columns (d) and (e) show intermediate results for outliers
E, and E, respectively.

Table 3 compares COROLA with IALM, FPCP, GoDec, SSGoDec, APG,
and DECOLOR, which are fast and accurate batch methods in the literature,
in terms of F-measure. For some sequences such as Fountain, Canoe,
Overpass, and TimeOfDay, COROLA works much better than all other
methods. In sequences where the background is very noisy (e.g.,
Campus and Fountain02), the constraints of connectedness and sparseness on
the subspace of images prove to be useful; both DECOLOR and COROLA
exploit them, leading to much better results than the other methods. Further,
when the objects move very slowly (e.g., Canoe) or stop for a long
time (Overpass, Fountain, and TimeOfDay), none of the competing methods can
produce accurate results. In contrast, COROLA produces acceptable results for
these challenging sequences for the same reasons as for the results of Table 2,
i.e., using GMM and (14), the difference between outliers and the rest of the pixels is
boosted, so COROLA can detect intermittently moving objects better than
other methods. In summary, Tables 2 and 3 demonstrate that our
method outperforms the state-of-the-art in terms of F-measure on most sequences.

Table 2: Comparison of F-measure score between COROLA and online methods

Sequence              MOG      GRASTA   OPRMF    ORPCA    ORPCA+MRF   COROLA
WaterSurface          0.4723   0.7531   0.5483   0.6426   0.9166      0.9503
Fountain              0.7766   0.4978   0.2393   0.2870   0.8283      0.9175
Curtain               0.7709   0.7046   0.4199   0.8504   0.8920      0.9038
Hall                  0.5802   0.7471   0.7215   0.7329   0.7844      0.8298
Campus                0.4510   0.1885   0.1700   0.1893   -           0.7650
Escalator             0.3869   0.5474   0.5179   0.4452   -           0.7714
Lobby                 0.5628   0.8231   0.6728   0.6336   0.8081      0.8129
ShoppingMall          0.5275   0.6816   0.6621   0.5541   -           0.7452
Canoe                 0.5114   0.5386   0.4400   0.5152   0.8534      0.8901
Fall                  0.5420   0.5057   0.4929   0.4030   -           0.8596
Fountain02            0.7801   0.3569   0.2926   0.4684   0.8517      0.8642
Overpass              0.5095   0.5609   0.5105   0.6079   0.8272      0.8471
WavingTrees           0.6639   0.7354   0.5259   0.6315   0.8689      0.8688
Bootstrap             0.4613   0.5635   0.5627   0.5619   -           0.6930
Camouflage            0.6922   0.2191   0.6525   0.2307   0.9118      0.8738
Foreground Aperture   0.2601   0.6757   0.5628   0.6118   0.6910      0.6709
TimeOfDay             0.6147   0.5645   0.5258   0.6315   -           0.8344

Table 3: Comparison of F-measure score between COROLA and batch methods

Sequence       IALM     FPCP     GoDec    SSGoDec  APG      DECOLOR  COROLA
WaterSurface   0.3519   0.4910   0.4304   0.4473   0.5907   0.9022   0.9503
Fountain       0.1633   0.1894   0.1531   0.2574   0.2641   0.2075   0.9175
Curtain        0.3184   0.5290   0.3706   0.4344   0.7260   0.8700   0.9038
Hall           0.5716   0.7295   0.7128   0.5713   0.7601   0.8169   0.8298
Campus         0.1660   0.1701   0.1640   0.1649   0.1979   0.7811   0.7650
Escalator      0.5066   0.5192   0.1316   0.5075   0.5440   0.8205   0.7714
Lobby          0.3213   0.7188   0.7393   0.6194   0.7286   0.6579   0.8129
ShoppingMall   0.6093   0.6256   0.6143   0.5880   0.7057   0.6382   0.7452
Canoe          0.5072   0.5169   0.5107   0.3091   0.4193   0.1603   0.8901
Fall           0.4112   0.4191   0.4137   0.4236   0.5232   0.8760   0.8596
Fountain02     0.2553   0.3066   0.2713   0.2714   0.3204   0.8327   0.8642
Overpass       0.5492   0.5528   0.5454   0.5517   0.5698   0.3573   0.8471
WavingTrees    0.5130   0.5130   0.5113   0.1829   0.7031   0.8845   0.8688
Bootstrap      0.6517   0.6525   0.6490   0.5567   0.5619   0.6342   0.6930
Camouflage     0.6518   0.6518   0.6428   0.6426   0.3441   0.3661   0.8738
F-A            0.3233   0.3238   0.3238   0.6854   0.7200   -        0.6709
TimeOfDay      0.1523   0.2187   0.1630   0.1664   0.6808   0.4683   0.8344

4.2.2. Computational Cost

COROLA is implemented in Matlab and C++. We run all experiments on
a PC with a 3.4 GHz Intel i7 CPU and 16 GB RAM. To show the importance of
online methods in continuous operation, we compare the scalability of COROLA
with DECOLOR under varying spatial resolution and number of images.

Unlike DECOLOR, the per-frame computational cost of COROLA is independent
of the number of images. The dominant cost of DECOLOR comes from
the computation of SVD in each iteration, so as the size of the matrix
D increases, the computation time of DECOLOR grows at least linearly with
the number of images. We compare the computation time of COROLA with
DECOLOR after convergence of both methods in Table 4. In this table, the
average time for processing each frame by DECOLOR increases with the number
of images, and it is about an order of magnitude slower than COROLA for
sequences of 1000 images or more.

Scalability in spatial resolution is another advantage of online methods over
batch processing methods. Increasing the resolution of images significantly
affects DECOLOR. Using high-resolution images results in a huge matrix
D, so that decomposing D becomes very expensive. On the other hand,
COROLA is an online method and is independent of the number of images,
i.e., we do not have to deal with a large D, and its computation time grows only
with the image resolution.

Table 4: Time evaluation of COROLA with DECOLOR method

Method    Resolution x #images   Low Rank (s)   MRF (s)   Total (s)
DECOLOR   [320 x 240] x 200      0.1036         0.0828    0.1864
          [320 x 240] x 400      0.1531         0.1297    0.2828
          [320 x 240] x 600      0.1687         0.1601    0.3279
          [320 x 240] x 800      0.2016         0.1825    0.3841
          [320 x 240] x 1000     0.3948         0.3191    0.7139
COROLA    [320 x 240] x 1000     0.0231         0.0605    0.0836

4.3. Experiments on a Moving Camera

In this section, we test our method on real public sequences for moving
cameras, namely the “Berkeley motion segmentation dataset” [38]. Table 5 shows
the details of five challenging sequences that we use in our experiments.

We compare our method with DECOLOR as the leading method based on
low-rank approximation that can handle the problem of object detection with
a moving camera in a short sequence. Although He et al. [32] recently
proposed transformed-GRASTA, it only works well for camera jitter and is
not appropriate for a moving camera. Fig. 10 shows the qualitative results of
COROLA in comparison with DECOLOR for moving object detection
using a moving camera. The first three experiments have been performed on
the short sequences “cars6”, “cars7”, and “people1”, and the results from COROLA
are comparable with those from DECOLOR. For the last two sequences,
“marple13” and “tennis”, DECOLOR has a problem aligning images when the
last images are not similar to the first images of these sequences. This is
common in continuous processing, and all batch methods have this problem.

Table 5: Details of all sequences used in our experiments for a moving camera

Dataset                        Sequence   Size x #frames
Berkeley motion segmentation   cars6      [320,240] x 30
                               cars7      [320,240] x 24
                               people1    [320,240] x 40
                               tennis     [320,240] x 200
                               marple13   [320,240] x 75

Figure 10: Comparison of foreground objects between DECOLOR and COROLA. Columns
(a) and (b) show the input image and its ground truth; columns (c) and (d) show the obtained
foreground mask for the DECOLOR and COROLA methods, respectively.


To show the result of DECOLOR on the marple13 and tennis sequences (in the last
two rows of Fig. 10), we used the last 30 images of the sequences, which have less
camera motion. Since the last images in the sequence are no longer similar to
the initial ones in the matrix, DECOLOR failed, as expected. In contrast, since
COROLA works online and only considers the last two images, it can process
the last two sequences of Table 5 without any problems and provides acceptable
results in comparison with DECOLOR. For completeness, we have also included
in our comparative study another online registration-based method in [3].

Table 6 shows the quantitative evaluation of COROLA in comparison with
DECOLOR and the method in [5]. Experiments over all five sequences show
that the results of COROLA are comparable with DECOLOR for the last 30
images of a sequence, but COROLA has the advantage in terms of its ability for
real-time continuous processing. With more than 30 images in a sequence, DECOLOR
can no longer produce a valid result due to the significant dissimilarity of the
images later in the sequence from the initial ones. In contrast, our sequential
method is always able to produce a valid result, often with higher accuracy.

Table 6: Comparison of F-measure score. ¹ The last 30 images are used.

Sequence   FFD based model   DECOLOR      COROLA
cars6      0.8870            0.9052       0.9409
cars7      0.8257            0.8441       0.8867
people1    0.8122            0.9666       0.9056
tennis     0.8494            0.8404¹/NA   0.8642
marple13   0.6407            0.8063¹/NA   0.8271

5. Conclusion 

In this paper, we have proposed a novel online method named COROLA to
detect moving objects in a video using the framework of low-rank matrix
approximation. Our online framework works iteratively on each image of the video
to extract foreground objects accurately. The key to our online formulation is
to exploit the sequential nature of a continuous video of a scene, where the
background model does not change discontinuously and can therefore be obtained
by updating the background model learned from preceding images. We have also
applied COROLA to the case of a moving camera. Since our method works online
and is independent of the number of images, it is suitable for real-time object
detection in continuous monitoring tasks. Our method overcomes the problems
of batch methods in terms of memory storage, time complexity, and camera
motion. Also important to the success of COROLA is using a Gaussian model
to separate noise from outliers and to tune the costs of assigning labels in the
MRF via $\alpha$ and the weights of the Gaussian parameters, dynamically and
automatically, especially when the object moves very slowly or stops for some
frames. Based on our extensive experiments on synthetic data and real data
sequences, we are able to establish that COROLA achieves the best performance
in comparison with all evaluated methods, including the state-of-the-art batch
and online methods.

Despite its satisfactory performance in all of our experiments, COROLA
shares one disadvantage with DECOLOR. Since both methods have non-convex
formulations, they might converge to a local minimum, with results depending
on the initialization of parameters; however, for the case of background modeling,
images are roughly similar and parameters do not change significantly. Therefore,
the issue of local minima has not affected successful object detection in
our experiments. A challenge facing COROLA is severe illumination changes,
which is a problem for all online methods. In the future, we plan to develop a
version of COROLA that can work under severe illumination changes.